System and method for analytically modeling data organized according to related attributes

ABSTRACT

A system and method for analytically modeling data with related attributes is disclosed. A single dimension is used to provide data according to each of the related attributes, and, thus, may be said to play the role of each related attribute depending on a received query. The measure of the analytical data model is tied to the dimension according to both data attributes to allow the measure to be analyzed by the dimension according to both attributes.

BACKGROUND OF THE INVENTION

[0001] 1. Field of the Invention

[0002] The present invention relates to systems and methods foranalytically modeling data organized and stored in a relationaldatabase, and, more particularly, to analytically modeling dataorganized according to related attributes.

[0003] 2. Description of the Prior Art

[0004] Online analytical processing (OLAP) is a key part of many datawarehouse and business analysis systems. OLAP services provide for fastanalysis of multidimensional information. For this purpose, OLAPservices provide for multidimensional access and navigation of data inan intuitive and natural way, providing a global view of data that canbe drilled down into particular data of interest. Speed and responsetime are important attributes of OLAP services that allow users tobrowse and analyze data online in an efficient manner. Further, OLAPservices typically provide analytical tools to rank, aggregate, andcalculate lead and lag indicators for the data under analysis.

[0005] In this context, an OLAP cube may be modeled according to auser's perception of the data. The cube may have multiple dimensions,each dimension modeled according to attributes of the data. Typically,there is a hierarchy associated with each dimension. For example, a timedimension can include years subdivided into months subdivided into weekssubdivided into days, while a geography dimension can include countriessubdivided into states subdivided into cities. Dimension members act asindices for identifying a particular cell or range of cells within thecube.

[0006] OLAP services are often used to analytically model data that isstored in a relational database such as, for example, an OnlineTransactional Processing (OLTP) database. Data stored in a relationaldatabase may be organized according to multiple tables with each tablehaving data corresponding to a particular data type. A tablecorresponding to a particular data type may be organized according tocolumns corresponding to data attributes. For example, datacorresponding to the type “Sales” may be organized in a “Sales” tablewith columns “Ship-to Customer ID”, “Bill-to Customer ID”, and “SaleQuantity”. Furthermore, data corresponding to the type “Customer” may beorganized in a “Customer” table with columns “Customer ID”, “Name”,“City”, and “State”.

[0007] The “Ship-to Customer ID” and “Bill-to Customer ID” attributes ofthe “Sales” table are related attributes because they bothcross-reference the “Customer ID” attribute of the “Customer” table. Foreach ship-to customer, data corresponding to the customer's “Name”,“City”, and “State” is stored in the “Customer” table on the row havingthe ship-to customer's “Customer ID”. Likewise, for each bill-tocustomer, data corresponding to the customer's “Name”, “City”, and“State” is stored in the “Customer” table on the row having the bill-tocustomer's “Customer ID”.

[0008] One issue that arises with regard to analytically modeling datafrom a relational database is how to best take into consideration datawith such related attributes. In existing methods for analyticallymodeling data with related attributes, a plurality of dimensions eachprovides data to one of the related attributes. For example, an OLAPcube may be modeled according to data stored in the “Sales” and“Customer” tables of a relational database. The cube may have a firstdimension modeled according to the “Customer” type and providing dataaccording to the “Ship-to Customer” attribute and a second dimensionmodeled according to the “Customer” type and providing data “Bill-toCustomer” attribute.

[0009] Modeling two dimensions that each provide data to one of therelated attributes is a complex and time-consuming process because, foreach dimension, data must be retrieved from multiple tables. Thecomplexity and time required to model the cube would be greatly reducedif, rather than having two dimensions that each provide data to one ofthe related attributes, the cube has a single dimension that providesdata to both related attributes. Thus, there is a need in the art for asystem and method for analytically modeling data with relatedattributes, the system and method having a single dimension providingdata to a plurality of related attributes.

SUMMARY OF THE INVENTION

[0010] Accordingly, in the present invention, a system and method foranalytically modeling data with related attributes is disclosed. In arelational database, a first table organizes a first type according to afirst attribute and a second attribute, and a second table organizes asecond type according to a third attribute. The first attribute of thefirst table is related to the third attribute of the second table suchthat the first table may be cross-referenced to the second tablethereby. The second attribute of the first table is related to the thirdattribute of the second table such that the first table may becross-referenced to the second table thereby.

[0011] The data stored in the relational database is analyticallymodeled. A measure is modeled according to the first type of the firsttable. A dimension is modeled according to the second type of the secondtable. The measure is tied to the dimension according to the firstattribute of the first table and the third attribute of the second tableto allow the measure to be analyzed by the dimension according to thefirst attribute. The measure is also tied to the dimension according tothe second attribute of the first table and the third attribute of thesecond table to allow the measure to be analyzed by the dimensionaccording to the second attribute. Thus, the dimension provides dataaccording to both the first attribute of the first table and the secondattribute of the first table.

BRIEF DESCRIPTION OF THE DRAWINGS

[0012] The illustrative embodiments will be better understood afterreading the following detailed description with reference to theappended drawings, in which:

[0013]FIG. 1 is a block diagram representing a general purpose computersystem in which aspects of the present invention and/or portions thereofmay be incorporated;

[0014]FIG. 2 is a sample relational database table corresponding to“Sales” data;

[0015]FIG. 3 is a sample relational database table corresponding to“Customer” data;

[0016]FIG. 4 is a prior art analytical data cube derived from the tablesof FIGS. 2 and 3;

[0017]FIG. 5 is an analytical data cube derived from the tables of FIGS.2 and 3 in accordance with one embodiment of the present invention; and

[0018]FIG. 6 is a hierarchical data tree showing data organized in adimension according to a plurality of gradations.

DETAILED DESCRIPTION

[0019] A system and method for analytically modeling data with relatedattributes is disclosed below with reference to the aforementioneddrawings. Those skilled in the art will readily appreciate that thedescription given herein with respect to those drawings is forexplanatory purposes only and is not intended in any way to limit thescope of the invention to the specific embodiments shown. Throughout thedescription, like reference numerals are employed to refer to likeelements in the respective figures.

[0020] Computer Environment

[0021]FIG. 1 and the following discussion are intended to provide abrief general description of a suitable computing environment in whichthe present invention and/or portions thereof may be implemented.Although not required, the invention is described in the general contextof computer-executable instructions, such as program modules, beingexecuted by a computer, such as a client workstation or a server.Generally, program modules include routines, programs, objects,components, data structures and the like that perform particular tasksor implement particular abstract data types. Moreover, it should beappreciated that the invention and/or portions thereof may be practicedwith other computer system configurations, including hand-held devices,multi-processor systems, microprocessor-based or programmable consumerelectronics, network PCs, minicomputers, mainframe computers and thelike. The invention may also be practiced in distributed computingenvironments where tasks are performed by remote processing devices thatare linked through a communications network. In a distributed computingenvironment, program modules may be located in both local and remotememory storage devices.

[0022] As shown in FIG. 1, an exemplary general purpose computing systemincludes a conventional personal computer 120 or the like, including aprocessing unit 121, a system memory 122, and a system bus 123 thatcouples various system components including the system memory to theprocessing unit 121. The system bus 123 may be any of several types ofbus structures including a memory bus or memory controller, a peripheralbus, and a local bus using any of a variety of bus architectures. Thesystem memory includes read-only memory (ROM) 124 and random accessmemory (RAM) 125. A basic input/output system 126 (BIOS), containing thebasic routines that help to transfer information between elements withinthe personal computer 120, such as during start-up, is stored in ROM124.

[0023] The personal computer 120 may further include a hard disk drive127 for reading from and writing to a hard disk (not shown), a magneticdisk drive 128 for reading from or writing to a removable magnetic disk129, and an optical disk drive 130 for reading from or writing to aremovable optical disk 131 such as a CD-ROM or other optical media. Thehard disk drive 127, magnetic disk drive 128, and optical disk drive 130are connected to the system bus 123 by a hard disk drive interface 132,a magnetic disk drive interface 133, and an optical drive interface 134,respectively. The drives and their associated computer-readable mediaprovide non-volatile storage of computer readable instructions, datastructures, program modules and other data for the personal computer120.

[0024] Although the exemplary environment described herein employs ahard disk, a removable magnetic disk 129, and a removable optical disk131, it should be appreciated that other types of computer readablemedia which can store data that is accessible by a computer may also beused in the exemplary operating environment. Such other types of mediainclude a magnetic cassette, a flash memory card, a digital video disk,a Bernoulli cartridge, a random access memory (RAM), a read-only memory(ROM), and the like.

[0025] A number of program modules may be stored on the hard disk,magnetic disk 129, optical disk 131, ROM 124 or RAM 125, including anoperating system 135, one or more application programs 136, otherprogram modules 137 and program data 138. A user may enter commands andinformation into the personal computer 120 through input devices such asa keyboard 140 and pointing device 142. Other input devices (not shown)may include a microphone, joystick, game pad, satellite disk, scanner,or the like. These and other input devices are often connected to theprocessing unit 121 through a serial port interface 146 that is coupledto the system bus, but may be connected by other interfaces, such as aparallel port, game port, or universal serial bus (USB). A monitor 147or other type of display device is also connected to the system bus 123via an interface, such as a video adapter 148. In addition to themonitor 147, a personal computer typically includes other peripheraloutput devices (not shown), such as speakers and printers. The exemplarysystem of FIG. 1 also includes a host adapter 155, a Small ComputerSystem Interface (SCSI) bus 156, and an external storage device 162connected to the SCSI bus 156.

[0026] The personal computer 120 may operate in a networked environmentusing logical connections to one or more remote computers, such as aremote computer 149. The remote computer 149 may be another personalcomputer, a server, a router, a network PC, a peer device or othercommon network node, and typically includes many or all of the elementsdescribed above relative to the personal computer 120, although only amemory storage device 150 has been illustrated in FIG. 1. The logicalconnections depicted in FIG. 1 include a local area network (LAN) 151and a wide area network (WAN) 152. Such networking environments arecommonplace in offices, enterprise-wide computer networks, intranets,and the Internet.

[0027] When used in a LAN networking environment, the personal computer120 is connected to the LAN 151 through a network interface or adapter153. When used in a WAN networking environment, the personal computer120 typically includes a modem 154 or other means for establishingcommunications over the wide area network 152, such as the Internet. Themodem 154, which may be internal or external, is connected to the systembus 123 via the serial port interface 146. In a networked environment,program modules depicted relative to the personal computer 120, orportions thereof, may be stored in the remote memory storage device. Itwill be appreciated that the network connections shown are exemplary andother means of establishing a communications link between the computersmay be used.

[0028] System and Method of the Present Invention

[0029] An analytical data service such as, for example, On-LineAnalytical Processing (OLAP) may be employed to model data stored in arelational database such as, for example, an On-Line TransactionalProcessing (OLTP) database. As set forth previously, data stored in arelational database may be organized according to multiple tables, witheach table having data corresponding to a particular data type. A tablecorresponding to a particular data type may be organized according tocolumns corresponding to data attributes. One such table is shown inFIG. 2, with data corresponding to the type “Sales” organized in a“Sales” table 200 with columns “Ship-to Customer ID” 210, “Bill-toCustomer ID” 212, and “Sale Quantity” 214 and rows corresponding toindividual sales entries. Another such table is shown in FIG. 3, withdata corresponding to the type “Customer” organized in a “Customer”table 300 with columns “Customer ID” 310, “Name” 312, “City” 314, and“State” 316 and rows corresponding to individual customer entries.

[0030] The attributes “Ship-to Customer ID” 210 and “Bill-to CustomerID” 212 from “Sales” table 200 are related attributes because they bothcross-reference the “Customer ID” attribute 310 from “Customer” table300. That is, each ship-to customer in “Sales” table 200 is referencedaccording to a “Customer ID” 310 present in the “Customer” table 300 anddata corresponding to a ship-to customer's name and city as stored in“Customer” table 300 on the row having the corresponding ship-tocustomer ID. Likewise, each bill-to customer in “Sales” table 200 isreferenced according to a “Customer ID” 310 present in the “Customer”table 300 and data corresponding to a bill-to customer's name and cityas stored in “Customer” table 300 on the row having the correspondingbill-to customer ID.

[0031] Referring now to FIG. 4, a prior art OLAP model of the data from“Sales” table 200 and “Customer” table 300 is shown as organized into adata cube 400. Cube 400 has a measure “Sales Quantity” 405 modeledaccording to the “Sales” type. Cube 400 has a first dimension 410modeled according to the “Customer” type and providing data according to“Ship-to Customer” attribute 210. Cube 400 also has a second dimension412 modeled according to the same “Customer” type and providing dataaccording to “Bill-to Customer” attribute 212. As should be appreciated,other dimensions (not shown) may also be present based on other OLTPtables (not shown).

[0032] In cube 400, first dimension 410 and second dimension 412 eachprovide data to one of the related attributes “Ship-to Customer” 210 and“Bill-to Customer” 212, respectively. Again, modeling a cube 400 withtwo dimensions 410 and 412 that each provides data to one of two relatedattributes 210 and 212 is a complex and time-consuming process because,for each dimension, data must be retrieved from multiple tables 200 and300. The complexity and time required to model the cube 400 would begreatly reduced if, rather than having two dimensions 410 and 412 thateach provide data to one of the related attributes 210 and 212, the cube400 had a single dimension that provided data to both related attributes210 and 212. Thus, the system and method of the present invention modelsa cube with a single dimension providing data according to both relatedattributes 210 and 212.

[0033] In particular and referring now to FIG. 5, in one embodiment ofthe present invention, analytical data cube 500 is an OLAP model of thedata from “Sales” table 200 and “Customer” table 300. As seen, cube 500has a measure “Sales Quantity” 505 modeled according to the “Sales”type. Cube 500 also has a single dimension 510 modeled according to the“Customer” type of the “customer” table 300 of FIG. 3, where suchdimension 510 provides data according to both “Ship-to Customer”attribute 210 and “Bill-to Customer” attribute 212 of “Sales” table 200.Unlike prior art data cube 400 of FIG. 4 that has two dimensions410-412, each modeled according to the “Customer” type, data cube 500 ofFIG. 5 has only the single dimension 510 modeled according to the“Customer” type and providing data for both the “Ship-to Customer”attribute 210 and the “Bill-to Customer” attribute 212, in effectplaying the role of each attribute 210 and 212 according to a particularquery. By eliminating the duplication of the second dimension, thesingle “role playing” dimension makes the cube easier to define andreduces both the time required to build the cube and the storage sizerequired for the data in the cube.

[0034] Dimension 510 may have a dimension hierarchy represented in agrossly simplified fashion by data tree 600 as shown in FIG. 6. Nodes610 and 611 in the top row are “State” nodes corresponding to “State”attribute 316 in FIG. 3. Nodes 620-624 in the second row are “City”nodes corresponding to “City” attribute 314 in FIG. 3. Nodes 630-637 inthe third row are “Name” nodes corresponding to “Name” attribute 312 inFIG. 3. Nodes 640-649 in the fourth row are “Customer ID” nodescorresponding to “Customer ID” attribute 310 in FIG. 3. As should beappreciated, one advantage of a hierarchically organized analytical datamodel is that such a model allows data to be aggregated in response to aquery. In particular, data aggregated according to the dimensionhierarchy of FIG. 6 may be aggregated based on any of the levels in thehierarchy. For example, if a query requires a sales quantity for aspecific city, then data may be aggregated by “City” attribute 314 andthe second row of data tree 600.

[0035] Using cube 500 in response to a query should be apparent to therelevant public. Accordingly, no particular example is provided.Generally, based on whether a particular query requests data accordingto a bill-to customer or a ship-to customer, the dimension acts to playthe role of each bill-to customer or each ship-to customer,respectively.

[0036] The programming necessary to effectuate the processes performedin connection with the present invention is relatively straight-forwardand should be apparent to the relevant programming public. Accordingly,such programming is not attached hereto. Any particular programming,then, may be employed to effectuate the present invention withoutdeparting from the spirit and scope thereof.

[0037] While the invention has been described and illustrated withreference to specific embodiments, those skilled in the art willrecognize that modifications and variations may be made withoutdeparting from the principles of the invention as described above andset forth in the following claims. For example, while the invention hasbeen described with reference to a “Sales” table and a “Customer” tablesthe invention may be used in conjunction with any table from arelational database. Furthermore, the analytical data models of thepresent invention may comprise any number of dimensions corresponding toany number of data attributes. Accordingly, reference should be made tothe appended claims as indicating the scope of the invention.

We claim:
 1. A method in combination with first and second tables ofdata, the first table organizing a first type according to a firstattribute, and a second attribute and the second table organizing asecond type according to a third attribute, the first attribute of thefirst table being related to the third attribute of the second tablesuch that the first table may be cross-referenced to the second tablethereby, the second attribute of the first table being related to thethird attribute of the second table such that the first table may becross-referenced to the second table thereby, the method comprising:modeling a measure according to the first type of the first table;modeling a dimension according to the second type of the second table;tying the measure to the dimension according to the first attribute ofthe first table and the third attribute of the second table to allow themeasure to be analyzed by the dimension according to the firstattribute; and tying the measure to the dimension according to thesecond attribute of the first table and the third attribute of thesecond table to allow the measure to be analyzed by the dimensionaccording to the second attribute, whereby the dimension provides dataaccording to both the first attribute of the first table and the secondattribute of the first table.
 2. The method of claim 1, comprisingmodeling a measure according to the first type of the first table, thefirst table comprising data stored in a relational database.
 3. Themethod of claim 1, comprising modeling a dimension according to thesecond type of the second table, the second table comprising data storedin a relational database.
 4. The method of claim 1, comprising modelinga dimension according to the second type of the second table, the secondtable having data organized hierarchically therein.
 5. The method ofclaim 4, further comprising aggregating the data of the dimensionaccording to the hierarchical organization of the second table.
 6. Themethod of claim 1, comprising modeling a measure of a data cubeaccording to the first type of the first table.
 7. The method of claim6, comprising modeling a measure of a data cube formatted for onlineanalytical processing according to the first type of the first table. 8.The method of claim 1, comprising modeling a dimension of a data cubeaccording to the second type of the second table.
 9. The method of claim8, comprising modeling a dimension of a data cube formatted for onlineanalytical processing according to the second type of the second table.10. A computer readable medium having stored thereon computer readableinstructions in combination with first and second tables of data, thefirst table organizing a first type according to a first attribute, anda second attribute and the second table organizing a second typeaccording to a third attribute, the first attribute of the first tablebeing related to the third attribute of the second table such that thefirst table may be cross-referenced to the second table thereby, thesecond attribute of the first table being related to the third attributeof the second table such that the first table may be cross-referenced tothe second table thereby, the computer readable instructions forperforming the following steps: modeling a measure according to thefirst type of the first table; modeling a dimension according to thesecond type of the second table; tying the measure to the dimensionaccording to the first attribute of the first table and the thirdattribute of the second table to allow the measure to be analyzed by thedimension according to the first attribute; and tying the measure to thedimension according to the second attribute of the first table and thethird attribute of the second table to allow the measure to be analyzedby the dimension according to the second attribute, whereby thedimension provides data according to both the first attribute of thefirst table and the second attribute of the first table.
 11. Thecomputer readable medium of claim 10, comprising instructions forperforming the step of modeling a measure according to the first type ofthe first table, the first table comprising data stored in a relationaldatabase.
 12. The computer readable medium of claim 10, comprisinginstructions for performing the step of modeling a dimension accordingto the second type of the second table, the second table comprising datastored in a relational database.
 13. The computer readable medium ofclaim 10, comprising instructions for performing the step of modeling adimension according to the second type of the second table, the secondtable having data organized hierarchically therein.
 14. The computerreadable medium of claim 13, further comprising instructions forperforming the step of aggregating the data of the dimension accordingto the hierarchical organization of the second table.
 15. The computerreadable medium of claim 10, comprising instructions for performing thestep of modeling a measure of a data cube according to the first type ofthe first table.
 16. The computer readable medium of claim 15,comprising instructions for performing the step of modeling a measure ofa data cube formatted for online analytical processing according to thefirst type of the first table.
 17. The computer readable medium of claim10, comprising instructions for performing the step of modeling adimension of a data cube according to the second type of the secondtable.
 18. The computer readable medium of claim 17, comprisinginstructions for performing the step of modeling a dimension of a datacube formatted for online analytical processing according to the secondtype of the second table.
 19. A system in combination with first andsecond tables of data, the first table organizing a first type accordingto a first attribute, and a second attribute and the second tableorganizing a second type according to a third attribute, the firstattribute of the first table being related to the third attribute of thesecond table such that the first table may be cross-referenced to thesecond table thereby, the second attribute of the first table beingrelated to the third attribute of the second table such that the firsttable may be cross-referenced to the second table thereby, the systemcomprising: a processor operative to execute computer executableinstructions; and memory having stored therein computer executableinstructions for performing the following steps: modeling a measureaccording to the first type of the first table; modeling a dimensionaccording to the second type of the second table; tying the measure tothe dimension according to the first attribute of the first table andthe third attribute of the second table to allow the measure to beanalyzed by the dimension according to the first attribute; and tyingthe measure to the dimension according to the second attribute of thefirst table and the third attribute of the second table to allow themeasure to be analyzed by the dimension according to the secondattribute, whereby the dimension provides data according to both thefirst attribute of the first table and the second attribute of the firsttable.
 20. The system of claim 19, comprising computer executableinstructions for performing the step of modeling a measure according tothe first type of the first table, the first table comprising datastored in a relational database.
 21. The system of claim 19, comprisingcomputer executable instructions for performing the step of modeling adimension according to the second type of the second table, the secondtable comprising data stored in a relational database.
 22. The system ofclaim 19, comprising computer executable instructions for performing thestep of modeling a dimension according to the second type of the secondtable, the second table having data organized hierarchically therein.23. The system of claim 22, further comprising computer executableinstructions for performing the step of aggregating the data of thedimension according to the hierarchical organization of the secondtable.
 24. The system of claim 19, comprising computer executableinstructions for performing the step of modeling a measure of a datacube according to the first type of the first table.
 25. The system ofclaim 24, comprising computer executable instructions for performing thestep of modeling a measure of a data cube formatted for onlineanalytical processing according to the first type of the first table.26. The system of claim 19, comprising computer executable instructionsfor performing the step of modeling a dimension of a data cube accordingto the second type of the second table.
 27. The system of claim 26,comprising computer executable instructions for performing the step ofmodeling a dimension of a data cube formatted for online analyticalprocessing according to the second type of the second table.